Goto

Collaborating Authors

 split conformal prediction


Conditional Predictive Inference for General Structured Data with Group Symmetries

arXiv.org Machine Learning

We study distribution-free predictive inference for data with group symmetries, aiming to establish near-conditional coverage guarantees beyond exchangeability for structured data. While many predictive inference methods achieve a target coverage level, most provide marginal coverage. In practice, conditional predictive inference is often preferred, as it quantifies uncertainty for black-box predictions given observed attributes, thereby accommodating heterogeneity. Although many efforts have pursued efficient conditional coverage, existing methods rely on the i.i.d. or exchangeable assumption, often violated in structured settings such as networks, clusters, and imaging data. Recently, SymmPI introduced a unified approach to predictive inference under group symmetries beyond exchangeability; nevertheless, its guarantees remain marginal and do not account for population heterogeneity. To bridge this gap, we introduce C-SymmPI, a framework that achieves near-conditional coverage under general data structures with group symmetries, extending beyond exchangeability to cover networks, cluster-level data, and related structures. Inspired by relaxed multi-accuracy, our approach reformulates conditional coverage as miscoverage error over a user-specified function class. We establish theoretical guarantees under distributional invariance and distribution shift, and derive convergence rates for linear and RKHS function classes, recovering state-of-the-art results in the exchangeable setting as special cases. For computational efficiency, we develop two variants: a projection-based algorithm for high-dimensional observations, and a sampling-based algorithm for large or infinite groups. We demonstrate effectiveness on hierarchical and network data. Empirical results show that C-SymmPI delivers more informative and stable conditional coverage with improved accuracy compared to existing methods.


Skew-adaptive conformal prediction

arXiv.org Machine Learning

We develop a skew-adaptive extension of split conformal prediction for regression. The method starts from an asymmetric interval family centered at a point prediction and uses the gauge approach to deduce the conformity score induced by this family. The inverse hyperbolic sine transform of signed scaled residuals provides the training target for an additional predictive model, whose role is to learn how predictive uncertainty should tilt across the feature space. The resulting procedure preserves the finite-sample marginal validity of split conformal prediction under exchangeability, while producing intervals that adapt to both local scale and local skewness. We also develop a calibration-sample-based estimator for comparing the expected relative future width of the skew-adaptive and classical scaled-score intervals. Experiments on a variety of datasets indicate gains in prediction interval efficiency over the scaled-score construction and conformalized quantile regression, and show that the proposed estimator closely matches the corresponding average width ratio observed on the test sample.


Conformalized Quantile Regression

Neural Information Processing Systems

Conformal prediction is atechnique for constructing prediction intervals that attainvalidcoverage infinite samples, without making distributional assumptions. Despite this appeal, existing conformal methods can be unnecessarily conservativebecause theyform intervals ofconstant orweakly varying length across the input space.



Predictive inference for time series: why is split conformal effective despite temporal dependence?

arXiv.org Machine Learning

We consider the problem of uncertainty quantification for prediction in a time series: if we use past data to forecast the next time point, can we provide valid prediction intervals around our forecasts? To avoid placing distributional assumptions on the data, in recent years the conformal prediction method has been a popular approach for predictive inference, since it provides distribution-free coverage for any iid or exchangeable data distribution. However, in the time series setting, the strong empirical performance of conformal prediction methods is not well understood, since even short-range temporal dependence is a strong violation of the exchangeability assumption. Using predictors with "memory" -- i.e., predictors that utilize past observations, such as autoregressive models -- further exacerbates this problem. In this work, we examine the theoretical properties of split conformal prediction in the time series setting, including the case where predictors may have memory. Our results bound the loss of coverage of these methods in terms of a new "switch coefficient", measuring the extent to which temporal dependence within the time series creates violations of exchangeability. Our characterization of the coverage probability is sharp over the class of stationary, $ฮฒ$-mixing processes. Along the way, we introduce tools that may prove useful in analyzing other predictive inference methods for dependent data.



Singleton-Optimized Conformal Prediction

arXiv.org Machine Learning

Conformal prediction can be used to construct prediction sets that cover the true outcome with a desired probability, but can sometimes lead to large prediction sets that are costly in practice. The most useful outcome is a singleton prediction-an unambiguous decision-yet existing efficiency-oriented methods primarily optimize average set size. Motivated by this, we propose a new nonconformity score that aims to minimize the probability of producing non-singleton sets. Starting from a non-convex constrained optimization problem as a motivation, we provide a geometric reformulation and associated algorithm for computing the nonconformity score and associated split conformal prediction sets in O(K) time for K-class problems. Using this score in split conformal prediction leads to our proposed Singleton-Optimized Conformal Prediction (SOCOP) method. We evaluate our method in experiments on image classification and LLM multiple-choice question-answering, comparing with standard nonconformity scores such as the (negative) label probability estimates and their cumulative distribution function; both of which are motivated by optimizing length. The results show that SOCOP increases singleton frequency (sometimes by over 20%) compared to the above scores, with minimal impact on average set size.




Improving the statistical efficiency of cross-conformal prediction

arXiv.org Machine Learning

Conformal prediction has emerged as a general and versatile framework for constructing prediction sets in regression and classification tasks (Shafer and Vovk, 2008). Unlike traditional methods, which often depend on rigid distributional assumptions, conformal prediction transforms point predictions from any prediction (or black-box) algorithm into prediction sets that guarantee valid finite-sample marginal coverage. Originally introduced by Vovk et al. (2005), it has become increasingly influential, with numerous methods and extensions being proposed since its introduction. In particular, full conformal prediction by Vovk et al. (2005), demonstrates favorable properties regarding the coverage and the size of the prediction set. However, these advantages are counterbalanced by a substantial computational cost, which limits its practical application.